Exploration and Exploitation in Parkinson’s Disease: Behavioral Analyses

Authors
Affiliations

Björn Meder

Health and Medical University, Potsdam, Germany

Martha Sterf

Medical School Berlin, Berlin, Germany

Charley M. Wu

University of Tübingen, Tübingen, Germany

Matthias Guggenmos

Health and Medical University, Potsdam, Germany

Published

June 19, 2025

Code
# Housekeeping: Load packages and helper functions
# Housekeeping
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(message = FALSE)
knitr::opts_chunk$set(warning = FALSE)
knitr::opts_chunk$set(fig.align='center')
knitr::opts_chunk$set(prefer_html = TRUE)

options(knitr.kable.NA = '')

packages <- c('gridExtra', 'BayesFactor', 'tidyverse', "RColorBrewer", "lme4", "sjPlot", "lsr", "brms", "kableExtra", "afex", "emmeans", "viridis", "ggpubr", "hms", "scales", "cowplot", "gtsummary", "webshot", "webshot2", "parameters", "bridgesampling")
lapply(packages, require, character.only = TRUE)

set.seed(0815)

# file with various statistical functions, among other things it provides tests for Bayes Factors (BFs)
source('statisticalTests.R')

# Wrapper for brm models such that it saves the full model the first time it is run, otherwise it loads it from disk
run_model <- function(expr, modelName, path='brm', reuse = TRUE) {
  path <- paste0(path,'/', modelName, ".brm")
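  # Start from a "try-error" placeholder so the model is refit when reuse = FALSE
  fit <- structure("model not loaded", class = "try-error")
  if (reuse) {
    fit <- suppressWarnings(try(readRDS(path), silent = TRUE))
  }
  if (inherits(fit, "try-error")) {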
    fit <- eval(expr)
    saveRDS(fit, file = path)
  }
  fit
}

# Setting some plotting params
w_box          <- 0.2      # width of boxplot, also used for jittering points and lines    
line_jitter    <- w_box / 2
xAnnotate      <- -0.3

# jitter params
jit_height  <- 0.01
jit_width   <- 0.05
jit_alpha   <- 0.6

# colors 
groupcolors    <- c("#1b9e77", "#d95f02", "#7570b3")
choice3_colors <- c("#e7298a", "#66a61e", "#e6ab02")
Code
########################################################
# get behavioral data
########################################################
dat_gridsearch <- read_csv("data/data_gridsearch_Parkinson.csv", show_col_types = FALSE) %>% 
  mutate(type_choice  = factor(type_choice, levels = c("Repeat", "Near", "Far")))

########################################################
# get bonus round data
########################################################
dat_bonus   <- read_csv(file="data/data_gridsearch_Parkinson_bonusround.csv") %>% 
  mutate(bonus_environment = as.factor(bonus_environment))

########################################################
# get subject data
########################################################
dat_sample <- read_delim("data/data_gridsearch_subjects.csv", escape_double = FALSE, trim_ws = TRUE, show_col_types = FALSE) %>% 
  mutate(gender = as.factor(gender),
         group  = factor(group, levels = c("PNP", "PD+", "PD-"))) %>% 
  mutate(last_ldopa = if_else(group != "PNP", as_hms(last_ldopa), as_hms(NA)),
         next_ldopa = if_else(group != "PNP", as_hms(next_ldopa), as_hms(NA)),
         time_exp = if_else(group != "PNP", as_hms(time_exp), as_hms(NA))) %>% 
  mutate(time_since_ldopa = as.numeric(time_exp - last_ldopa, units = "mins"))

# combine behavioral and subject data
dat <- dat_sample %>% 
  left_join(dat_gridsearch, by = "id") %>% 
  arrange(group)

1 Abstract

We investigated how patients with Parkinson’s disease (PD) balance the explore-exploit trade-off using a spatially correlated bandit task, where the spatial structure of rewards facilitated value generalization (i.e., nearby options yield similar rewards). Participants were tested either shortly after taking their regular Levodopa (L-Dopa) dose (N=29) or just before their next scheduled dose (N=26). Patients with polyneuropathy served as a control group (N=33), comparable in age, depressive symptoms, and basic cognitive functioning. Behavioral and computational analyses revealed distinct patterns of exploration and exploitation. PD patients on L-Dopa balanced exploration and exploitation, though not as efficiently as polyneuropathy patients. In stark contrast, patients off L-Dopa rarely exploited known high-value options and primarily explored novel ones. This overreliance on exploration impaired their ability to navigate the explore-exploit trade-off and maximize rewards. To better understand the mechanisms underlying these behavioral differences, we employed a computational approach using the Gaussian Process Upper Confidence Bound (GP-UCB) model. This model integrates similarity-based generalization with two distinct exploration mechanisms: directed exploration, which seeks to reduce uncertainty about rewards, and random exploration, which introduces stochastic variability in choice behavior. The model parameters showed that behavioral differences between the on- and off-medication conditions were primarily driven by differences in uncertainty-directed exploration, while the level of random exploration remained unchanged. Both PD groups showed reduced generalization compared to controls, contributing to poorer overall performance. Our findings indicate that L-Dopa selectively modulates uncertainty-directed exploration, providing a more nuanced understanding of the central role of dopamine in the regulation of exploratory behavior.

2 Intro

A central distinction between different forms of exploration behavior is that directed exploration reflects the drive for knowledge about novel options, whereas undirected exploration refers to random variability in the choice process (Giron et al., 2023; Meder et al., 2021; Sadeghiyeh et al., 2020; Schulz et al., 2019; Wu et al., 2018).

[TO BE CONTINUED]

3 Experiment

We investigated how patients with Parkinson’s disease (PD) manage the explore-exploit trade-off using a spatially correlated multi-armed bandit task. Participants accumulated rewards by selecting tiles (options) with normally distributed rewards. The spatial correlation between rewards facilitated generalization, allowing participants to adapt to the structure of the environment and balance exploring new options versus exploiting known high-reward options.

Screenshot from experiment and example environments (from Giron et al., 2023)

3.1 Materials and procedure

Forty distinct environments were generated using a radial basis function kernel with \(\lambda = 4\), creating a bivariate reward function on a grid that maps each tile location to a specific reward value. These smooth reward functions varied gradually across the grid, creating environments with spatially correlated rewards.
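To illustrate how such an environment can arise, the following minimal sketch samples one smooth reward function from a Gaussian process prior with a radial basis function kernel. It is not the original environment-generation code: the 8 x 8 grid size, the kernel parameterization \(k(\mathbf{x}, \mathbf{x}') = \exp(-\lVert \mathbf{x} - \mathbf{x}' \rVert^2 / (2\lambda^2))\), the min-max rescaling to [0, 50], and the function names `rbf_kernel` and `sample_environment` are all assumptions made for this example.

```r
# Sketch only: sample one spatially correlated environment from a GP prior.
# Assumed (not stated in the text): 8 x 8 grid, kernel exp(-d^2 / (2 * lambda^2)),
# and min-max rescaling of rewards to [0, 50].

rbf_kernel <- function(grid, lambda = 4) {
  d2 <- as.matrix(dist(grid))^2          # squared Euclidean distances between tiles
  exp(-d2 / (2 * lambda^2))
}

sample_environment <- function(grid_size = 8, lambda = 4, seed = 1) {
  set.seed(seed)
  grid <- as.matrix(expand.grid(x = 0:(grid_size - 1), y = 0:(grid_size - 1)))
  K <- rbf_kernel(grid, lambda)
  # small jitter on the diagonal keeps the Cholesky factorization numerically stable
  R <- chol(K + diag(1e-6, nrow(K)))
  f <- as.vector(t(R) %*% rnorm(nrow(K)))        # one draw from N(0, K)
  z <- 50 * (f - min(f)) / (max(f) - min(f))     # rescale rewards to [0, 50]
  data.frame(grid, z = z)
}

env <- sample_environment()
head(env)
```

With a length scale of 4 on a grid of this size, neighboring tiles receive very similar rewards, which is the property that makes generalization from observed to unobserved tiles useful.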

Participants completed 10 rounds of the task, each featuring a new environment drawn without replacement from the set of 40 environments. In each round, participants had 25 choices to maximize rewards. The first round served as a tutorial to familiarize participants with the task and was excluded from the analyses. The final round (round 10) was a bonus round where, after 15 choices, participants were asked to predict rewards for five unrevealed options. Data from this round were also excluded from the main analysis and analyzed separately.

At the start of each round, one tile was randomly revealed, and participants then made 25 sequential choices. On each trial, they could either click a new tile or re-click a previously selected tile. Selections were made by clicking the tile on the computer screen, upon which they received a reward arbitrarily scaled to the range [0,50]. Re-clicked tiles showed small variations in reward due to normally distributed noise.

3.2 Sample

We collected data from adult participants with Parkinson’s disease (PD) who regularly received Levodopa (L-Dopa) for treatment (Abbott, 2010; Tambasco et al., 2018). Participants were recruited via a neurologist’s outpatient practice. Eligible participants were evaluated based on Hoehn-Yahr scores recorded in their patient files. The scale assesses disease severity and motor impairments based on a score from 1 to 5, with higher scores indicating greater severity (Goetz et al., 2004; Hoehn & Yahr, 1967). We limited recruitment to individuals with scores between 1 and 3, as scores of 4 and 5 reflect severe impairment.

PD patients were randomly assigned to two conditions: on medication (PD+) and off medication (PD-). In the PD+ group (N=29), patients’ scheduled L-Dopa was administered at least 30 minutes before the start of the experiment. In the PD- group (N=26), the next scheduled dose for participants was timed such that they were in a low dopamine state during the experiment, offering a clear contrast to the PD+ group. Thus, we refer to the ‘on medication’ condition as the state after taking L-Dopa and the ‘off medication’ condition as the state before their next scheduled dose.

The comparison group (N=33) consisted of individuals with polyneuropathy (PNP), a condition associated with physical symptoms similar to the motor impairments seen in Parkinson’s disease. However, since polyneuropathy primarily affects peripheral nerves, it is typically not associated with cognitive impairments, enabling a comparison in terms of physical symptomatology and the resulting burden of suffering.

3.3 Clinical assessment

To characterize participants’ clinical status, we employed standardized measures assessing Parkinson’s disease severity, basic cognitive function, and depressive symptoms. PD severity was evaluated using the Hoehn-Yahr scale, which rates motor impairments such as postural instability and gait difficulties (Hoehn & Yahr, 1967). Participants can receive a score between one and five, with higher scores indicating more severe problems. Basic cognitive function of all participants was assessed through the Mini-Mental State Examination (MMSE), which is frequently used in patients with dementia (Folstein et al., 1975). The test comprises 30 questions pertaining to different domains, including memory (e.g., recalling three objects), temporal and spatial orientation (e.g., date and location), and arithmetic ability. Finally, all participants answered the German version of the Beck Depression Inventory II, a self-report inventory consisting of 21 items measuring depressive symptoms (Beck et al., 1996; Hautzinger et al., 2006).

3.4 Sample characteristics

Table 1 shows the demographics of our sample, along with their Hoehn-Yahr, MMSE, and BDI scores. In the PD+ group, the mean time since their last L-Dopa dose was 102 min; in the PD- group it was 244 min.

Code
dat_sample %>% 
  group_by(group) %>% 
  summarise(n = n(),
            female = sum(gender == "f"),
            mean_age = mean(age),
            sd_age = sd(age),
            mean_BDI = mean(BDI, na.rm= T), 
            mean_MMSE = mean(MMSE, na.rm= T),
            mean_HY = mean(hoehn_yahr, na.rm= T),
            mean_time_since_ldopa = mean(time_since_ldopa,  na.rm= T)) %>% 
  
  kable(., format = "html", escape = FALSE, digits = 1) %>%
  kable_styling("striped", full_width = FALSE)
Table 1
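| group | n | female | mean_age | sd_age | mean_BDI | mean_MMSE | mean_HY | mean_time_since_ldopa |
|---|---|---|---|---|---|---|---|---|
| PNP | 33 | 13 | 65.2 | 7.1 | 8.3 | 28.9 | | |
| PD+ | 29 | 11 | 63.2 | 6.1 | 8.3 | 29.1 | 1.9 | 102.2 |
| PD- | 26 | 9 | 66.1 | 7.1 | 7.5 | 28.9 | 2.0 | 243.7 |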

Demographics and clinical assessment.

4 Behavioral data

All behavioral data are stored in data_gridsearch_Parkinson.csv, which contains the following variables (Table 2):

  • id: participant id
  • age: participant age in years
  • gender: (m)ale, (f)emale, (d)iverse
  • x and y: the sampled coordinates on the grid
  • chosen: identifier of the chosen tile (one value per x-y location)
  • z: the reward obtained from the chosen tile, 0-50. Re-clicked tiles could show small variations in the observed color (i.e., underlying reward) due to normally distributed noise, \(\epsilon \sim N(0,1)\).
  • zscaled: the observed outcome (reward), scaled in each round to a randomly drawn maximum value in the range of 70% to 90% of the highest reward value
  • trial: the trial number (0-25), with 0 corresponding to the initially revealed random tile, i.e., trial 1 is the first choice
  • round: the round number (1 through 10), with 1=practice round (not analyzed) and 10=bonus round (analyzed only for bonus round judgments)
  • distance: the Manhattan distance between consecutive clicks. NA for trial 0, i.e., the initially revealed random tile.
  • type_choice: categorizes consecutive clicks as “Repeat” (clicking the same tile as in the previous trial), “Near” (clicking a directly neighboring tile, i.e., distance=1), and “Far” (clicking a tile with distance > 1). NA for trial 0, i.e., the initially revealed random tile.
  • previous_reward: the reward z obtained on the previous step. NA for trial 0, i.e., the initially revealed random tile.
  • last_ldopa: time of the last L-Dopa dose (HH:MM)
  • next_ldopa: scheduled time of the next L-Dopa dose (HH:MM)
  • time_exp: time of the experiment (HH:MM)
  • time_since_ldopa: time since last L-Dopa (in minutes)
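The definitions of distance and type_choice can be made concrete with a small base-R sketch. The toy click sequence below mirrors the x and y values of the example rows in Table 2; the data frame name `clicks` is hypothetical, and the code only illustrates the definitions given in the list, not how the variables were originally computed.

```r
# Toy click sequence for one round (trial 0 = initially revealed tile);
# coordinates mirror the example rows shown in Table 2
clicks <- data.frame(trial = 0:5,
                     x = c(3, 3, 3, 4, 5, 6),
                     y = c(3, 3, 4, 5, 6, 7))

# Manhattan distance to the previous click; NA for trial 0
clicks$distance <- c(NA, abs(diff(clicks$x)) + abs(diff(clicks$y)))

# Repeat: distance 0, Near: distance 1, Far: distance > 1
clicks$type_choice <- cut(clicks$distance,
                          breaks = c(-Inf, 0, 1, Inf),
                          labels = c("Repeat", "Near", "Far"))

clicks
```

Because the distance is measured in city-block terms, a diagonal step to an adjacent tile has distance 2 and therefore counts as “Far” under this definition.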

We analyzed the behavioral data in terms of performance and exploration behavior. These analyses exclude the tutorial and bonus rounds, leaving a total of 200 search decisions (8 rounds \(\times\) 25 trials) for each participant. We then report the results of the bonus round, where we analyze participants’ reward predictions and confidence judgments. We report both frequentist statistics and Bayes factors (\(BF\)) to quantify the relative evidence of the data in favor of the alternative hypothesis (\(H_A\)) over the null hypothesis (\(H_0\)); see Appendix for details and references. Various helper functions are implemented in statisticalTests.R. Regression analyses were performed in a Bayesian framework with Stan, accessed via the R package brms, complemented by frequentist hierarchical regression analyses (via the lmer function of package lme4).

Code
# show example data
head(dat %>%
       group_by(group) %>%
       slice_head(n=25)  %>% 
       ungroup()) %>%
  kable("html", caption = "Example behavioral data.") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>% 
  scroll_box(width = "100%", height = "300px")
Table 2
Example behavioral data.
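| id | age | gender | group | BDI | MMSE | hoehn_yahr | last_ldopa | next_ldopa | time_exp | time_since_ldopa | session | x | y | chosen | z | zscaled | time | trial | round | distance | type_choice | previous_reward |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 111 | 54 | f | PNP | 12 | 30 | | | | | | 1 | 3 | 3 | 28 | 19 | 18 | 1.719326e+12 | 0 | 1 | | | |
| 111 | 54 | f | PNP | 12 | 30 | | | | | | 1 | 3 | 3 | 28 | 19 | 18 | 1.719326e+12 | 1 | 1 | 0 | Repeat | 19 |
| 111 | 54 | f | PNP | 12 | 30 | | | | | | 1 | 3 | 4 | 36 | 26 | 22 | 1.719326e+12 | 2 | 1 | 1 | Near | 19 |
| 111 | 54 | f | PNP | 12 | 30 | | | | | | 1 | 4 | 5 | 45 | 32 | 26 | 1.719326e+12 | 3 | 1 | 2 | Far | 26 |
| 111 | 54 | f | PNP | 12 | 30 | | | | | | 1 | 5 | 6 | 54 | 50 | 38 | 1.719326e+12 | 4 | 1 | 2 | Far | 32 |
| 111 | 54 | f | PNP | 12 | 30 | | | | | | 1 | 6 | 7 | 63 | 13 | 14 | 1.719326e+12 | 5 | 1 | 2 | Far | 50 |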

Example data.

4.1 Performance: Rewards by round

Code
# mean reward per subject (practice and bonus round excluded)
df_mean_reward_subject_by_round <- dat %>% 
  filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
  group_by(id, round) %>% 
  summarise(age = mean(age),
            group = first(group),
            sum_reward = sum(z),
            mean_reward = mean(z), 
            sd_reward = sd(z)) 

df_summary_by_round <- df_mean_reward_subject_by_round %>%
  group_by(round, group) %>%
  summarize(
    mean_of_means = mean(mean_reward, na.rm = TRUE),  # Renaming to avoid confusion
    se_reward = sd(mean_reward, na.rm = TRUE) / sqrt(n()),  # Standard error
    .groups = 'drop'
  )

aov_rounds <- aov_ez(
  id = "id",                 
  dv = "mean_reward",        
  within = "round",          
  between = "group",         
  data = df_mean_reward_subject_by_round
)


# kable(as.data.frame(aov_rounds$anova_table), 
#       format = "html", escape = FALSE, digits = 2, 
#       caption = "ANOVA results with round as within-subjects factor and group as between subjects factor, where rewards per round were first aggregated within subjects.") %>%
#   kable_styling("striped", full_width = FALSE)


brm_rounds <- run_model(brm(
  mean_reward ~ round * group + (1 | id),   # Random intercept for subject
  data = df_mean_reward_subject_by_round,   
  family = gaussian(),                      
  iter = 4000,                              
  warmup = 1000,                            
  chains = 4,                               
  cores = 4,                                
  seed = 0511,
  save_pars = save_pars(all = TRUE)),
  modelName = 'brm_reward_rounds')

# Extract fitted values and add to data df
fitted_values <- fitted(brm_rounds, re_formula = NA)
df_mean_reward_subject_by_round$fitted_mean_reward <- fitted_values[, "Estimate"]

p <- ggplot(df_mean_reward_subject_by_round, aes(x = round, y = mean_reward, group = group, shape = group, color = group)) +
  geom_point(data = df_summary_by_round, aes(x = round, y = mean_of_means, shape = group), size = 3) +
  geom_line(aes(y = fitted_mean_reward), linewidth = 1) +  
  geom_jitter(aes(x = round, y = mean_reward), size = 1, alpha = 0.3, width = 0.2) +
  scale_y_continuous("Mean Reward", breaks = c(25,30,35)) +
  xlab("Round") +
  scale_fill_manual(values = groupcolors) +
  scale_color_manual(values = groupcolors) +
  ggtitle("Mean Reward by Rounds and Group (brms)") +
  theme_classic() +
  theme(legend.title = element_blank())

# tbl_regression(brm_rounds, exponentiate = F) 
#tab_model(brm_rounds)

# Reduced model for computing BF: no round term
brm_rounds_reduced <- run_model(brm(
  mean_reward ~ group + (1 | id),
  data = df_mean_reward_subject_by_round,
  family = gaussian(),
  iter = 4000, 
  warmup = 1000, 
  chains = 4, 
  cores = 4, 
  seed = 0511,
  save_pars = save_pars(all = TRUE)),
modelName='brm_reward_rounds_reduced')

# Compute Bayes Factor: Full vs. Reduced
#bf_brm_rounds <- bayes_factor(brm_rounds, brm_rounds_reduced)


# format_parameters(brm_rounds)
# 
# params <- model_parameters(brm_rounds) |> print_md()
# params$Parameter <- gsub("groupPDP", "PD+", params$Parameter)
# params$Parameter <- gsub("groupPDP", "PD+", params$Parameter)
# params$Term <- gsub("round", "Round", params$Term)
# 
# 
# params$Term <- gsub("groupPDP", "PD", params$Term)

# plot_model(brm_rounds, type = "est") +
#   theme_classic()

Figure 1 shows the obtained rewards by round, for each group. An ANOVA with round as within- and group as between-subjects factor showed a difference between groups, with PNP patients achieving the greatest rewards, followed by PD+ and PD- patients, but no change across rounds (and no interaction). A Bayesian regression analysis yielded comparable results, with the estimated effect of round on mean reward being very small (Estimate = -0.05, 95% CI [-0.35, 0.24]) and the credible interval including zero. In the subsequent analyses, we therefore aggregate across rounds.

Code
# Plot the mean reward by round for each group with dodged points and error bars
ggplot(df_summary_by_round, aes(x = round, y = mean_of_means, group = group, shape = group,  color = group, fill = group)) +
  geom_line(position = position_dodge(width = 0.3)) +  
  # geom_errorbar(aes(ymin = mean_of_means - 1.96 * se_reward, ymax = mean_of_means + 1.96 * se_reward), width = 0.2, position = dodge, color = "black") +  # 
  geom_errorbar(aes(ymin = mean_of_means - se_reward, ymax = mean_of_means + se_reward), width = 0.2, position = position_dodge(width = 0.3), alpha=0.7) +  #
  geom_point(position = position_dodge(width = 0.3), size = 3, stroke = 1, alpha=.9) +  
  coord_cartesian(ylim = c(20,40)) +
  #scale_shape_manual(values = c(21, 24, 22)) +  # circle, triangle, and square
  scale_fill_manual(values = groupcolors) +  
  scale_color_manual(values = groupcolors) +  
  #scale_color_manual(values = c("black","black","black")) +  
  # scale_y_continuous("Mean reward ± 95% CI") +
  scale_y_continuous("Mean reward ± SE") +
  scale_x_continuous("Round", breaks = 2:9) +
  theme_classic() +
  theme(legend.title = element_blank())

ggsave("plots/performance_rounds.png", width = 6, height = 3)
Figure 1: Performance over rounds (excluding tutorial and bonus round).

4.2 Performance: Rewards by group

Code
df_mean_reward_subject <- dat %>% 
  filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
  group_by(id) %>% 
  summarise(age = mean(age),
            group = first(group),
            sum_reward = sum(z),
            mean_reward = mean(z), 
            sd_reward = sd(z),
            BDI = first(BDI),          
            MMSE = first(MMSE),  
            hoehn_yahr = first(hoehn_yahr))

# some summary stats for obtained mean rewards
# df_mean_reward_subject %>%
#   group_by(group) %>%
#   summarise(n = n(),
#             m_reward = mean(mean_reward),
#             md_reward = median(mean_reward),
#             var_reward = var(mean_reward),
#             sd_reward = sd(mean_reward),
#             se_reward = sd_reward / sqrt(n),
#             lower_ci_reward = m_reward - qt(1 - (0.05 / 2), n - 1) * se_reward,
#             upper_ci_reward = m_reward + qt(1 - (0.05 / 2), n - 1) * se_reward) %>%
#   
#   kable(., format = "html", escape = FALSE, digits = 2) %>%
#   kable_styling("striped", full_width = FALSE)

Figure 2 shows the overall performance of each group, based on each subject’s mean reward across all trials. PNP participants achieved higher rewards than both PD patients on medication (\(t(60)=2.4\), \(p=.019\), \(d=0.6\), \(BF=2.8\)) and off medication (\(t(57)=6.9\), \(p<.001\), \(d=1.8\), \(BF>100\)). Notably, PD patients on medication achieved substantially higher rewards than patients off medication (\(t(53)=5.8\), \(p<.001\), \(d=1.6\), \(BF>100\)), indicating a strong beneficial effect of L-Dopa on the ability to balance exploration and exploitation.
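As a quick plausibility check on these effect sizes, Cohen's \(d\) for two independent groups can be recovered from the \(t\) statistic and the group sizes via \(d = t \sqrt{1/n_1 + 1/n_2}\). The sketch below assumes that the reported \(d\) is the classical pooled-standard-deviation version and works from the rounded \(t\) values above, so it only approximates the original computation; the helper name `d_from_t` is hypothetical.

```r
# Cohen's d from an independent-samples t statistic (pooled SD assumed):
# d = t * sqrt(1/n1 + 1/n2)
d_from_t <- function(t, n1, n2) t * sqrt(1 / n1 + 1 / n2)

# Group sizes from Table 1: PNP = 33, PD+ = 29, PD- = 26
d_check <- c(
  PNP_vs_PDplus     = d_from_t(2.4, 33, 29),
  PNP_vs_PDminus    = d_from_t(6.9, 33, 26),
  PDplus_vs_PDminus = d_from_t(5.8, 29, 26)
)
round(d_check, 1)
```

The recovered values round to 0.6, 1.8, and 1.6, in line with the effect sizes reported in the text.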

Code
# Boxplots of rewards by group
ggplot(df_mean_reward_subject, aes(x = group, y = mean_reward, color = group, fill = group, shape = group)) +
  geom_boxplot(alpha = 0.2, outlier.shape = NA) +  
  geom_jitter(width = 0.15, size = 2, alpha = 0.8) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +  
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  ylab("Mean reward") +
  xlab("") +
  ggtitle("Performance") +
  theme_classic() +
  theme(
    strip.background = element_blank(),
    strip.text = element_text(color = "black", size = 12),
    legend.position = 'none'
  )

ggsave("plots/performance_by_group.png", dpi=300, height = 5, width = 6 )
Figure 2: Obtained rewards by group. Each dot is one participant’s mean reward across all rounds and trials.
Code
### Comparison of groups
#We next conduct pairwise comparisons of the performance across patient groups. Patients with polyneuropathy (PNP) performed better than both groups of Parkinson patients. Parkinson patients on medication (PD+) performed better than patients off medication (PD-).

#### PNP vs. PD+
# t.test(subset(df_mean_reward_subject, group == 'PNP')$mean_reward, subset(df_mean_reward_subject, group == 'PD+' )$mean_reward, var.equal = T)
# ttestBF(subset(df_mean_reward_subject, group == 'PNP')$mean_reward, subset(df_mean_reward_subject, group == 'PD+')$mean_reward)

#### PNP vs. PD-
# t.test(subset(df_mean_reward_subject, group == 'PNP')$mean_reward, subset(df_mean_reward_subject, group == 'PD-' )$mean_reward, var.equal = T)
# ttestBF(subset(df_mean_reward_subject, group == 'PNP')$mean_reward, subset(df_mean_reward_subject, group == 'PD-')$mean_reward)

#### PD+ vs. PD-
# t.test(subset(df_mean_reward_subject, group == 'PD+')$mean_reward, subset(df_mean_reward_subject, group == 'PD-' )$mean_reward, var.equal = T)
# ttestBF(subset(df_mean_reward_subject, group == 'PD+')$mean_reward, subset(df_mean_reward_subject, group == 'PD-')$mean_reward)

4.3 Performance: Learning curves

Participants’ learning curves (Figure 3) show the average reward obtained in each trial, aggregated across rounds. For both polyneuropathy patients (PNP) and PD patients on medication (PD+), mean rewards increased as the round progressed, suggesting they effectively balanced exploration and exploitation to maximize rewards. In stark contrast, PD patients off medication (PD-) showed no improvement across trials.

TO DO: Add random reward as baseline

Figure 3: Learning curves, showing obtained mean reward for each trial, aggregated across rounds.

4.3.1 Performance: Role of physiological and cognitive assessments (BDI, MMSE, Hoehn-Yahr)

We also assessed patients in terms of their depressive symptoms (via BDI-II), cognitive functioning (via Mini-Mental State Examination, MMSE), and severity of motor symptoms (via Hoehn-Yahr scale, Parkinson’s disease patients only). We ran a hierarchical regression with reward as dependent variable and group, BDI score, and MMSE score as predictors, with random intercepts for participants to account for individual differences. This analysis yielded only an effect of group, suggesting that BDI and MMSE scores were not related to performance.

TO DO: Keep eye on MMSE scores, approaching significance

Code
# Hierarchical frequentist regression with random intercept: Reward as function of BDI and MMSE score (all patients)    
lmer_performance_BDI_MMSE <- lmer(z ~ group + BDI + MMSE + (1 | id), 
                                  data = subset(dat, trial > 0 & round %in% 2:9))

#summary(lmer_performance_BDI_MMSE)
tab_model(lmer_performance_BDI_MMSE, title = "Hierarchical regression results: Performance as function of BDI and MMSE score.", bpe="mean")
Hierarchical regression results: Performance as function of BDI and MMSE score.
Dependent variable: z

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 9.84 | -15.21 – 34.89 | 0.441 |
| group [PD+] | -2.33 | -3.98 – -0.67 | 0.006 |
| group [PD-] | -6.34 | -8.03 – -4.64 | <0.001 |
| BDI | -0.01 | -0.21 – 0.18 | 0.903 |
| MMSE | 0.82 | -0.04 – 1.67 | 0.060 |

| Random Effects | |
|---|---|
| σ2 | 152.10 |
| τ00 id | 9.85 |
| ICC | 0.06 |
| N id | 87 |
| Observations | 17400 |
| Marginal R2 / Conditional R2 | 0.042 / 0.100 |
Code
# Hierarchical Bayesian regression with random intercept: Reward as function of BDI and MMSE score (all patients)                              
brm_performance_BDI_MMSE <- run_model(brm(z ~ group + BDI + MMSE + (1|id),
                                          data=subset(dat, trial > 0 & round %in% 2:9 ),
                                          cores=1,
                                          seed = 0815,
                                          iter = 5000,
                                          warmup=1000,
                                          control = list(adapt_delta = 0.99, max_treedepth = 15)),
                                      #prior = prior(normal(0,10), class = "b")),
                                      modelName = 'brm_performance_BDI_MMSE')
#tab_model(brm_performance_BDI_MMSE, bpe="mean", title = "Hierarchical Bayesian regression: Performance as function of BDI and MMSE score.") 
#bayes_R2(brm_performance_BDI_MMSE) 
#tab_model(lmer_performance_BDI_MMSE, brm_performance_BDI_MMSE, title = "Hierarchical regression results: Performance as function of BDI and MMSE score.", bpe="mean")

Next, we ran a hierarchical regression for Parkinson’s patients only, with reward as dependent variable and group, BDI, MMSE, and Hoehn-Yahr score as predictors, again with random intercepts for participants to account for individual differences. This analysis yielded only an influence of group, i.e., being on or off L-Dopa.

Code
# Hierarchical frequentist regression with random intercept: Reward as function of BDI, MMSE, and Hoehn-Yahr score (Parkinson's patients only)   
lmer_reward_PD_only_BDI_MMSE_HY <- lmer(z ~ group + BDI + MMSE + hoehn_yahr + (1 | id), 
                                        data = subset(dat, trial > 0 & round %in% 2:9 & group != "PNP"))

#summary(lmer_reward_PD_only_BDI_MMSE_HY)

tab_model(lmer_reward_PD_only_BDI_MMSE_HY, title = "Hierarchical regression results: Performance of patients with Parkinson's disease as function of BDI, MMSE, and Hoehn-Yahr score.",  bpe="mean")
Hierarchical regression results: Performance of patients with Parkinson's disease as function of BDI, MMSE, and Hoehn-Yahr score.
Dependent variable: z

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 22.42 | -6.72 – 51.56 | 0.132 |
| group [PD-] | -4.04 | -5.50 – -2.57 | <0.001 |
| BDI | 0.12 | -0.08 – 0.33 | 0.244 |
| MMSE | 0.25 | -0.71 – 1.22 | 0.604 |
| hoehn yahr | 0.18 | -0.93 – 1.30 | 0.748 |

| Random Effects | |
|---|---|
| σ2 | 148.48 |
| τ00 id | 6.59 |
| ICC | 0.04 |
| N id | 55 |
| Observations | 11000 |
| Marginal R2 / Conditional R2 | 0.029 / 0.070 |
Code
# Hierarchical Bayesian regression with random intercept: Reward as function of BDI, MMSE, and Hoehn-Yahr score (Parkinson's patients only)
# Note: hoehn_yahr is included so that the model matches its name and the frequentist model above
brm_performance_PD_only_BDI_MMSE_HY <- run_model(brm(z ~ group + BDI + MMSE + hoehn_yahr + (1|id),
                                                     data=subset(dat, trial > 0 & round %in% 2:9 & group != "PNP"),
                                                     cores=1,
                                                     seed = 0815,
                                                     iter = 5000,
                                                     warmup=1000,
                                                     control = list(adapt_delta = 0.99)),
                                                 #prior = prior(normal(0,10), class = "b")),
                                                 modelName = 'brm_performance_PD_only_BDI_MMSE_HY')

# tab_model(lmer_reward_PD_only_BDI_MMSE_HY, brm_performance_PD_only_BDI_MMSE_HY, title = "Hierarchical regression results: Performance of patients with Parkinson's disease as function of BDI, MMSE, and Hoehn-Yahr score.",  bpe="mean")

4.4 Exploration vs. exploitation choices

To investigate the temporal dynamics of exploration and exploitation, we determined for each trial whether the chosen tile was novel (an exploration decision) or had already been selected previously (an exploitation decision). Intuitively, at the beginning of each round learners should predominantly engage in exploration to identify high-reward options, and gradually shift toward exploitative behavior as they approach the end of the round.
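The classification rule can be illustrated on a toy sequence of tile identifiers. This is a minimal sketch under assumptions: the tile values and the data frame name `choices` are made up for illustration, and the initially revealed tile (trial 0) is treated as already known, so re-clicking it counts as exploitation.

```r
# Toy sequence of chosen tiles in one round; first entry = initially revealed tile (trial 0)
chosen <- c(28, 28, 36, 45, 36, 45, 45)

# A choice is "new" (exploration) if the tile has not been seen before in this round,
# otherwise "repeat" (exploitation)
choices <- data.frame(
  trial    = 0:6,
  chosen   = chosen,
  decision = ifelse(!duplicated(chosen), "new", "repeat")
)

# Drop trial 0, which is not a choice made by the participant
choices <- choices[choices$trial > 0, ]

# Proportion of exploitation decisions in this toy round
prop_repeat <- mean(choices$decision == "repeat")
prop_repeat
```

In this toy round, four of the six choices revisit a known tile, so two thirds of the decisions count as exploitation.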

Code
# proportion of unique choices per round per subject
df_unique_choices_round <- 
  dat %>%
  filter(round %in% 2:9 & trial > 0) %>% 
  group_by(id,group, round) %>%  
  summarize(
    total = n(),  #  number of trials
    unique_tiles = n_distinct(x, y),  # unique (x, y) combinations (i.e., tiles)
    repeat_tiles = total - unique_tiles
  ) %>% 
  mutate(prop_unique = unique_tiles/total,
         prop_repeat = repeat_tiles/total) 

# proportion of unique choices across 8 rounds per subject
df_unique_choices_subject <- df_unique_choices_round %>% 
  group_by(id, group) %>% 
  summarize(m_prop_unique = mean(prop_unique),
            m_prop_repeat = mean(prop_repeat))

dat <- dat %>%
  group_by(id, round) %>%
  arrange(trial, .by_group = TRUE) %>%  # Ensure data is sorted by trial
  mutate(
    is_new = factor(if_else(!duplicated(chosen), "new", "repeat"))  # Check uniqueness based on 'chosen' column
  ) %>%
  ungroup()

dat_repeat_prop <- dat %>%
  filter(trial > 0 & round %in% 2:9) %>% 
  group_by(id, group, trial) %>%
  summarize(
    prop_repeat = mean(is_new == "repeat", na.rm = TRUE)  # Calculate proportion of "repeat" (exploitation) choices
    # prop_new = mean(is_new == "new", na.rm = TRUE)  # Calculate proportion of "new" (exploration) choices
  )

Figure 4 shows that both PNP and PD+ patients increased the amount of exploitation over time, indicating a goal-directed shift from exploring novel options to exploiting known high-value options. The PNP group began exploiting earlier in the round and exhibited a stronger overall tendency toward exploitation compared to the PD+ group, suggesting that this earlier focus on exploitation may underlie their better performance. In stark contrast, PD- patients predominantly engaged in exploration and showed only a weak tendency towards exploitation as the search horizon approached its end. This pattern is also reflected in the overall proportion of exploitation decisions (Figure 4, inset). PNP patients made more exploitation decisions than PD+ patients (\(t(60)=2.7\), \(p=.010\), \(d=0.7\), \(BF=4.8\)), who exploited more than PD- patients (\(t(53)=4.5\), \(p<.001\), \(d=1.2\), \(BF>100\)). Notably, PD- patients almost exclusively selected novel options during the task and only rarely exploited known options. These distinct behavioral patterns show how a suboptimal balance of exploration and exploitation affects obtained rewards.

Code
# Main plot
p_main <- ggplot(dat_repeat_prop, aes(x = trial, y = prop_repeat, fill = group, shape = group, color = group)) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "point", size = 3, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
  scale_fill_manual(values = groupcolors) + 
  scale_color_manual(values = groupcolors)+ 
  scale_y_continuous(labels = percent_format(accuracy = 1)) + 
  labs(
    x = "Trial",
    y = "Exploitation choices (Mean proportion ±95% CI)",
    title = "Exploration and exploitation over time"
  ) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.position = "inside",
        legend.position.inside = c(0.01, 0.35),
        legend.justification = c(0, 1),
        legend.title = element_blank()
  )

# Inset plot
p_inset <-  ggplot(df_unique_choices_subject, aes(x = group, y = m_prop_repeat, color = group, fill = group, shape = group)) +
  geom_boxplot(alpha = 0.2, size = 0.5, outlier.shape = NA) +  
  geom_jitter(width = 0.15, size = 0.8) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  scale_y_continuous("", 
                     breaks = c(0, 0.5, 1), 
                     labels = percent_format(accuracy = 1)) +
  coord_cartesian(ylim = c(0, 1.25)) +
  xlab("") +
  ggtitle("   Total exploitation choices") +
  theme_classic() +
  theme(
    strip.background = element_blank(),  
    strip.text = element_text(color = "black", size = 12),
    legend.position = "none",
    axis.title.y = element_text(size = 8),
    axis.text.y = element_text(size = 8),
    plot.margin = margin(0, 0, 0, 0),
    plot.title = element_text(size = 9, margin = margin(b = -10))
  )


# Combine plots
ggdraw() +
  draw_plot(p_main) +
  draw_plot(p_inset, x = 0.075, y = 0.50, width = 0.35, height = 0.4)

ggsave("plots/explore_exploit_choices.png", dpi=300, width = 8, height = 4.5)

# ggboxplot(df_unique_choices_subject, 
#                    x = "group", 
#                    # y = "m_prop_unique",
#                    y = "m_prop_repeat",
#                    color = "group", palette = groupcolors, fill = "group", alpha = 0.2, size=0.2,
#                    add = "jitter", jitter.size = 0.0, shape = "group", title = "   Total exploitation choices") +
# scale_y_continuous("", , #"Exploration choices", 
#                    breaks = c(0, 0.5, 1), 
#                    # limits = c(0,1),
#                    labels = percent_format(accuracy = 1)) +
# coord_cartesian(ylim=c(0,1.25)) +
# xlab("") +
# ggtitle("") +
# stat_compare_means(comparisons = list( c("PNP", "PD+"), c("PD+", "PD-")  ),
#                    paired = FALSE, 
#                    method = "t.test", 
#                    aes(label = paste0("p = ", after_stat(p.format)))  ) +
# stat_summary(fun = mean, geom="point", shape = 23, fill = "white", size=2) +
# theme_classic() +
# theme(strip.background = element_blank(),  
#       strip.text = element_text(color = "black", size=12),
#       legend.position = "none",
#       axis.title.y = element_text(size=8),
#       axis.text.y = element_text(size=7),
#       plot.margin = margin(0, 0, 0, 0),
#       plot.title = element_text(size = 9, margin = margin(b = -10)), 
# )
Figure 4: Balancing exploration and exploitation. The main plot shows the mean proportion of exploitation decisions per trial, aggregated over rounds. The inset shows the total proportion of exploitation choices across all trials and rounds.

4.5 Spatial trajectories

We next consider participants’ spatial search trajectories, i.e. the distance between consecutive clicks. Distance is measured as Manhattan distance, such that repeat clicks have distance 0, clicks on directly neighbouring tiles have distance 1, and clicks further away have distances >1.
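
The distance measure can be illustrated with a minimal sketch. It assumes that each click is stored as integer grid coordinates in hypothetical columns `x` and `y`; the click sequence is invented for illustration and is not taken from the study data.

```r
# Toy click sequence on the grid; coordinates are invented for illustration
clicks <- data.frame(
  trial = 0:4,
  x = c(3, 3, 4, 4, 7),
  y = c(2, 2, 2, 3, 6)
)

# Manhattan distance between two grid positions
manhattan <- function(x1, y1, x2, y2) abs(x1 - x2) + abs(y1 - y2)

# Distance of each click to the previous one (NA for the first tile)
n <- nrow(clicks)
clicks$distance <- c(
  NA,
  manhattan(clicks$x[-1], clicks$y[-1], clicks$x[-n], clicks$y[-n])
)

print(clicks)
```

In this toy sequence the second click repeats the first tile (distance 0), the next two clicks move to directly neighbouring tiles (distance 1), and the last click jumps three columns and three rows (distance 6).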

The most frequent choice was to select a neighboring tile (distance = 1), reflecting a local search approach (Wu et al., 2025). On average, PNP patients had the shortest distances, indicating more local searches and repeated clicks. PD+ patients had greater distances than PNP patients but shorter distances than PD- patients, who showed the highest distances. The distribution of distances shows that the high distances in the PD- group are primarily due to the few repeat choices (distance = 0) these patients made, i.e. very limited exploitation behavior.

Figure 5: Distance among consecutive clicks.

PNP patients had lower search distances than the PD+ group (\(t(60)=-2.4\), \(p=.020\), \(d=0.6\), \(BF=2.7\)) and lower distances than the PD- group (\(t(60)=-2.4\), \(p=.020\), \(d=0.6\), \(BF=2.7\)). There was no difference between Parkinson patients with (PD+) and without (PD-) medication (\(t(53)=-0.8\), \(p=.420\), \(d=0.2\), \(BF=.36\)).

4.5.1 Types of choices

We can also categorize each consecutive click as “repeat” (clicking the same tile as on the previous trial, i.e. distance = 0), “near” (clicking a directly neighboring tile, i.e. distance = 1), or “far” (clicking a tile with distance > 1). We first computed for each participant the proportion of each type of choice across all 8 rounds x 25 clicks = 200 search decisions and then plotted the mean proportion for each group.
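
A minimal sketch of this categorization is given below. The column `type_choice` used in the analysis code is created elsewhere; the helper `classify_choice()` and the toy distances are hypothetical and only show how such a column could be derived from Manhattan distances.

```r
# Classify a vector of Manhattan distances into the three decision types
classify_choice <- function(distance) {
  type <- ifelse(distance == 0, "repeat",
                 ifelse(distance == 1, "near", "far"))
  factor(type, levels = c("repeat", "near", "far"))
}

# Toy distances for one hypothetical participant
distance <- c(0, 1, 1, 6, 0, 2, 1, 0)
type_choice <- classify_choice(distance)

# Proportion of each decision type
prop <- prop.table(table(type_choice))
print(round(prop, 3))
```

Defining the factor levels explicitly keeps all three categories in the table even if a participant never makes one type of choice, which mirrors the role of `complete()` in the analysis code.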

The analyses reveal distinct search patterns across patient groups. PNP participants had the highest proportion of repeat (exploit) decisions, followed by the PD+ group. The proportion of repeat decisions in the PD- group was minimal. These behaviors help explain the differences in learning curves, where PNP patients showed the most significant improvement, followed by PD+ patients. In contrast, PD- patients exhibited no improvement across trials, due to their lower tendency to exploit high-reward options.

Code
df_types_choices_subject <- dat %>% 
  filter(round %in% 2:9 & trial > 0) %>% 
  group_by(id, group, type_choice) %>% 
  summarise(n = n()) %>% 
  complete(type_choice, fill = list(n = 0)) %>% # add choice types a participant never used, with n = 0
  group_by(id, group) %>%
  mutate(prop = n / sum(n)) 

df_types_choices_overall <- df_types_choices_subject %>% 
  group_by(group, type_choice) %>% 
  summarise(n = n(),
            mean_prop = mean(prop),
            SD_prop = sd(prop),
            se_prop = SD_prop / sqrt(n),
            lower_ci_prop = mean_prop - qt(1 - (0.05 / 2), n - 1) * se_prop,
            upper_ci_prop = mean_prop + qt(1 - (0.05 / 2), n - 1) * se_prop)

ggplot(df_types_choices_overall) +
  facet_grid(~group) +
  geom_bar(aes(x = type_choice, y = mean_prop, fill = group, alpha = type_choice), stat = "identity", colour = 'black') +
  scale_y_continuous("Average proportion", limits = c(0,0.7), breaks = seq(0,1,.2), expand = c(0, 0), labels =   scales::percent_format(accuracy=1)) +
  scale_alpha_discrete(range = c(0.2,1)) +
  scale_x_discrete("Search decision") +
  scale_fill_manual(values=groupcolors) +
  scale_color_manual(values=groupcolors) +
  ggtitle("Types of consecutive search decisions") +
  geom_text(aes(x = type_choice, y = mean_prop, label = scales::percent(mean_prop, accuracy = 1)), vjust = -0.5, size = 3) + 
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(hjust = 0.5),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        text = element_text(colour = "black"),
        strip.background = element_blank(),
        axis.text.x = element_text(colour="black"),
        axis.text.y = element_text(colour="black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())  

ggsave("plots/distance_types.png", dpi=300,height=3, width=5)
Figure 6: Types of search decisions in terms of distance.

An analysis of consecutive choice types over time reveals clear differences in search behavior between the groups. Both PNP and PD+ patients adapt their strategies as the round progresses by decreasing the number of local (distance = 1) and far (distance > 1) choices, while increasing the number of exploit decisions, indicating a shift from exploration to exploitation. Notably, the data indicate a faster shift to exploitation for PNP patients compared to PD+ patients, with an earlier and stronger preference for re-selecting known high-reward options. In contrast, PD- patients show limited adaptation, with the proportions of each decision type remaining relatively stable throughout the round, aside from a slight increase in exploit decisions.

Code
df_types_choices_trial_subject <- dat %>%
  filter(round %in% 2:9 & trial > 0) %>%
  group_by(id, group, trial, type_choice) %>%
  summarise(n = n()) %>%
  complete(type_choice, fill = list(n = 0)) %>%  # add choice types not used on a given trial, with n = 0
  group_by(id, group, trial) %>%
  mutate(prop = n / sum(n)) %>%  
  ungroup()


ggplot(df_types_choices_trial_subject, aes(x = trial, y = prop, color = type_choice, group = type_choice)) +
  facet_wrap(~group) +  
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "point", size = 2, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
  scale_fill_manual(values = choice3_colors) + 
  scale_color_manual(values = choice3_colors)+ 
  labs(
    x = "Trial",
    y = "Mean proportion (±95% CI)",
    title = "Types of choices over time",
    color = "Type choice"
  ) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.position = "inside", 
        legend.position.inside = c(0.15, 1),   
        legend.justification = c(1, 1),
        legend.title = element_blank(),
        legend.spacing.y = unit(0.05, 'cm'), 
        legend.key.height = unit(0.3, 'cm')  
  )    

Types of search decisions over time.

Code
ggsave("plots/types_choice_by_trial.png", dpi=300, width = 8, height = 4)

4.5.2 Distance as function of previous reward

Finally, we analysed the relation between the value of a reward obtained at time \(t\) and the search distance on the subsequent trial \(t+1\). If a large reward was obtained, searchers should search more locally, while conversely, if a low reward was obtained, searchers should be more likely to search farther away.
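
As a minimal sketch of how the two variables are paired, the example below lags the reward by one trial and correlates it with the subsequent search distance. The toy values are invented; the column names `reward`, `previous_reward`, and `distance` follow the naming used in the analysis code, but how `previous_reward` is actually constructed in the data set is an assumption here.

```r
# Toy data for one participant and one round; values are invented
d <- data.frame(
  trial    = 0:5,
  reward   = c(10, 35, 40, 12, 8, 30),
  distance = c(NA, 4, 1, 0, 5, 6)
)

# Reward obtained on the preceding trial (NA for the first trial)
d$previous_reward <- c(NA, head(d$reward, -1))

# Drop the initial tile, which has no preceding choice
d_valid <- d[d$trial > 0, ]

# Correlation between previous reward and subsequent search distance
r <- cor(d_valid$previous_reward, d_valid$distance)
print(round(r, 2))
```

A negative correlation, as in this toy example, corresponds to searching locally after high rewards and farther away after low rewards.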

Across all trials and rounds, search distance and previous reward were negatively correlated, indicating that participants tended to search further away following lower rewards compared to higher rewards. This relationship was stronger in PNP patients (\(r=-.44\), \(p<.001\), \(BF>100\)) and PD+ patients (\(r=-.34\), \(p<.001\), \(BF>100\)) compared to PD- patients (\(r=-.17\), \(p<.001\), \(BF>100\)). These findings suggest that PD patients off medication exhibited less adaptive search behavior than those on medication and individuals with polyneuropathies.

Code
# correlation of previous reward and distance of consecutive choices, by group
# overall, ignoring within-subject factor
# dat %>% 
#   filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
#   group_by(group) %>% 
#   summarise(corTestPretty(previous_reward, distance))

# mean correlation between distance and reward obtained on previous step
# first aggregated within each round and then within each subject
# such that there is one correlation for each subject

# reward_distance_cor <- dat %>% 
#   filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
#   group_by(id, round, group) %>% 
#   summarise(cor = cor(previous_reward, distance)) %>% 
#   mutate(cor = replace_na(cor, 0)) %>%  # in some rounds subjects clicked the same tile throughout; set cor=0
#   ungroup() %>% 
#   group_by(id, group) %>% 
#   summarise(mean_cor = mean(cor))

# mean correlation between distance and reward obtained on previous step as function of group
# reward_distance_cor %>% 
#   group_by(group) %>% 
#   summarise(n = n(),
#             m_cor = mean(mean_cor),
#             SD_cor = sd(mean_cor),
#             se_cor = SD_cor / sqrt(n),
#             lower_ci_cor = m_cor - qt(1 - (0.05 / 2), n - 1) * se_cor,
#             upper_ci_cor = m_cor + qt(1 - (0.05 / 2), n - 1) * se_cor)

#plot regression lines based on raw data
# ggplot(subset(dat, trial > 0 & round %in% 2:9), aes(x = previous_reward, y = distance, color = group)) +
#   facet_wrap(~group) +  
#   geom_jitter(alpha = 0.3, width = 0.1, height = 0.1) +  
#   geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +  
#   
#   ggtitle("Regression Lines for Distance by Previous Reward and Group") +
#   theme_minimal() +
#   xlab("Previous Reward") +
#   ylab("Distance")

Given the nested structure of the data, we next employed a Bayesian hierarchical regression analysis to predict search distance based on the reward obtained in the previous step, with group and their interaction as population-level (fixed) effects and subject-wise random intercepts. These analyses show that both the magnitude of reward obtained on the last step and group influence search distance. Notably, PD patients off medication (PD-) adapted their search behavior less in response to reward magnitude, while patients on medication (PD+) exhibited adaptation levels close to the PNP group.
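
To make the role of the interaction term concrete, the following sketch shows how group-specific slopes can be read off a `previous_reward * group` model. It is a simplification under stated assumptions: it uses simulated data, hypothetical group labels `A`, `B`, and `C`, and a plain `lm()` without random effects rather than the hierarchical models fitted here.

```r
set.seed(1)

# Simulated data: three hypothetical groups with different reward sensitivity
groups <- c("A", "B", "C")
true_slopes <- c(A = -0.08, B = -0.06, C = -0.02)

sim <- expand.grid(previous_reward = 0:50, group = groups, rep = 1:20)
sim$group <- factor(sim$group, levels = groups)
sim$distance <- 5 +
  unname(true_slopes[as.character(sim$group)]) * sim$previous_reward +
  rnorm(nrow(sim), sd = 0.5)

# Interaction model without random effects (a simplification of the mixed model)
fit <- lm(distance ~ previous_reward * group, data = sim)
b <- coef(fit)

# The main effect is the slope in the reference group (A);
# the other groups add their interaction coefficient to it
slopes <- c(
  A = unname(b["previous_reward"]),
  B = unname(b["previous_reward"] + b["previous_reward:groupB"]),
  C = unname(b["previous_reward"] + b["previous_reward:groupC"])
)
print(round(slopes, 3))
```

A flatter (less negative) slope in one group, as simulated for group `C`, corresponds to weaker adaptation of search distance to the previous reward.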

Code
# for now, random intercepts only, Random intercept + random slope not stable
# lmer_distance_reward <- lmer(distance ~ previous_reward * group + (previous_reward + group | id), 
# data = subset(dat, trial > 0 & round %in% 2:9))
# fit model
lmer_distance_reward <- lmer(distance ~ previous_reward * group + (1 | id), 
                             data = subset(dat, trial > 0 & round %in% 2:9))

#summary(lmer_distance_reward)
#emmeans(lmer_distance_reward, pairwise ~ previous_reward | group, pbkrtest.limit = 15000)

p_lmer_distance_reward <- plot_model(lmer_distance_reward, type = "pred", terms = c("previous_reward", "group")) +
  stat_summary(dat, mapping=aes(x=previous_reward, y=distance, color=group, fill=group,shape = group), fun=mean, geom='point', alpha=0.7, size=1, na.rm = TRUE)+
  scale_x_continuous('Previous Reward', breaks = (c(0,10,20,30,40,50))) +
  ylab('Distance to Next Option')+
  scale_fill_manual(values=groupcolors) +
  scale_color_manual(values=groupcolors) +
  ggtitle('Search Distance ~ Previous Reward (lmer)') +
  theme_classic() +
  theme(legend.position = "inside", 
        legend.position.inside = c(0.85, 0.9),   # Use legend.position.inside
        legend.justification = c(1, 1),
        legend.title = element_blank(),
        legend.box.background =  element_blank(),
        legend.key = element_rect(fill = "white")) +
  guides(color = guide_legend(
    override.aes = list(
      fill = NA, 
      size = 2
    )))

# p_lmer_distance_reward$layers[[2]]$show.legend <- FALSE
# p_lmer_distance_reward

ggsave("plots/regression_distance_reward_lmer.png", p_lmer_distance_reward, dpi=300, height=3, width=4)
Code
# Bayesian regression analysis
# run_model() is a wrapper for brm models: it saves the full model the first time it is run, otherwise it loads it from disk (directory `brm/`)
# Fixed effects: previous_reward, group, and their interaction.
# Random effects: the fitted model uses a random intercept per participant (id) only.
# The commented-out alternative additionally lets the effects of previous_reward and group vary across individuals (random slopes).

# random intercept and random slope
# brm_distance_reward <- run_model(brm(distance ~ previous_reward * group + (previous_reward + group | id), 

# random intercept                                     
brm_distance_reward <- run_model(brm(distance ~ previous_reward * group + (1|id),
                                     data=subset(dat, trial > 0 & round %in% 2:9 ),
                                     cores=1,
                                     seed = 0815,
                                     iter = 5000,
                                     warmup=1000,
                                     control = list(adapt_delta = 0.99, max_treedepth = 15)),
                                 #prior = prior(normal(0,10), class = "b")),
                                 modelName = 'brm_distance_reward')
#tab_model(brm_distance_reward, bpe="mean", title = "Bayesian regression results: Search distance as function of reward on previous step.") 
#bayes_R2(brm_distance_reward) 

# generate predictions manually for plotting (otherwise difficult to overlay the mean empirical values via geom_point)
prevReward <-  seq(0,50) #/ 50 # normalized reward
group  <-  levels(dat$group)
newdat <-  expand.grid(previous_reward=prevReward, group=group)

# predict distance based on previous reward
preds <-  fitted(brm_distance_reward, re_formula=NA, newdata=newdat, probs=c(.025, .975))
predsDF <-  data.frame(previous_reward=rep(prevReward, 3),
                       group=rep(levels(dat$group), each=length(prevReward)),
                       distance=preds[,1],
                       lower=preds[,3],
                       upper=preds[,4])

# average distance
grid  <-  expand.grid(x1=0:7, x2=0:7, y1=0:7, y2=0:7)
grid$distance <-  NA

for(i in 1:dim(grid)[1]){
  grid$distance[i] <- dist(rbind(c(grid$x1[i], grid$x2[i]), c(grid$y1[i], grid$y2[i])), method = "manhattan")
}

meanDist  <-  mean(grid$distance)

# plot predictions
ggplot() +
  stat_summary(dat, mapping=aes(x=previous_reward, y=distance, color=group, fill=group), fun=mean, geom='point', alpha=0.7, size=1, na.rm=T)+
  geom_line(predsDF, mapping=aes(x=previous_reward, y=distance, color=group), linewidth=1) +
  geom_ribbon(predsDF, mapping=aes(x=previous_reward, y=distance, ymin=lower, ymax=upper, fill=group), alpha=.3) +
  #geom_hline(yintercept=meanDist, linetype='dashed', color='red') + # mean distance
  # xlab('Normalized Previous Reward')+
  xlab('Previous Reward')+
  ylab('Distance to next chosen option')+
  scale_fill_manual(values=groupcolors) +
  scale_color_manual(values=groupcolors) +
  ggtitle('Search Distance ~ Previous Reward (brm)') +
  theme_classic() +
  theme(legend.position = "inside", 
        legend.position.inside = c(0.85, 0.9),   
        legend.justification = c(1, 1),
        legend.title = element_blank())          

ggsave("plots/regression_distance_reward_brms.png", dpi=300, height=3, width=4)
Figure 7

4.6 Bonus round judgments

In the bonus round, participants made 15 search decisions and then predicted the rewards for 5 randomly chosen, previously unobserved tiles. Subsequently, they chose one of the five tiles and continued the round until the search horizon of 25 clicks was met.

Data frame dat_bonus contains the following variables:

  • id: participant id
  • bonus_env_number: internal id of the bonus round environment
  • bonus_environment: recodes condition as Smooth (high spatial correlation)
  • x and y are the coordinates of the random tiles on the grid for which participants were asked to provide reward estimates
  • givenValue: participant reward judgment (scale 0-50)
  • howSecure: participant confidence for given reward judgment (scale 0-10)
  • chosen_x and chosen_y are the coordinates of the tile chosen after making reward and confidence judgments for 5 random tiles
  • true_z is the ground truth, i.e. true expected reward of a tile
  • error is the absolute deviation between participants’ reward estimates (givenValue) and ground truth (true_z)
  • chosen is whether the option was chosen or not (participants chose one of the five options after estimating their value and confidence in their reward prediction)
Note

Charley: is this scaling still correct (taken from YKWG code)? bonus_environment$z <- bonus_environment$z * scale_factor + 5

Table 3: Bonus round data.
id   bonus_env_number  bonus_environment  x  y  givenValue  howSecure  chosen_x  chosen_y  true_z  chosen      error  group
111  38                Rough              5  6  20          5          7         3         16.34   not chosen  3.66   PNP
111  38                Rough              2  7  26          4          7         3         16.16   not chosen  9.84   PNP
111  38                Rough              7  3  16          5          7         3         38.27   chosen      22.27  PNP
111  38                Rough              0  7  28          3          7         3         23.99   not chosen  4.01   PNP
111  38                Rough              7  6  30          5          7         3         34.10   not chosen  4.10   PNP
115  39                Rough              0  0  19          4          3         1         25.86   not chosen  6.86   PNP
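
The definition of the error variable can be checked against the rows shown in Table 3. The sketch below re-enters only the `id`, `givenValue`, and `true_z` values from these six rows; the data frame name `bonus` and the per-participant aggregation are illustrative and not part of the original analysis code.

```r
# First rows of Table 3 (judged value and ground truth)
bonus <- data.frame(
  id         = c(111, 111, 111, 111, 111, 115),
  givenValue = c(20, 26, 16, 28, 30, 19),
  true_z     = c(16.34, 16.16, 38.27, 23.99, 34.10, 25.86)
)

# Absolute deviation between judgment and ground truth
bonus$error <- abs(bonus$givenValue - bonus$true_z)
print(bonus)

# Mean absolute error per participant (based on the rows shown only)
mae <- aggregate(error ~ id, data = bonus, FUN = mean)
print(mae)
```

The recomputed values match the error column of Table 3.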

4.6.1 Prediction error

Figure 8 shows the mean absolute error between participants’ estimates and the true underlying expected reward for each group. All groups performed better than a random baseline:

  • PNP: \(t(32)=-14.9\), \(p<.001\), \(d=2.6\), \(BF>100\)
  • PD+: \(t(28)=-9.8\), \(p<.001\), \(d=1.8\), \(BF>100\)
  • PD-: \(t(25)=-8.3\), \(p<.001\), \(d=1.6\), \(BF>100\)

Pairwise comparisons showed a difference only between the PNP and PD- groups; the other two comparisons showed no reliable difference:

  • PNP vs. PD+: \(t(60)=-1.2\), \(p=.221\), \(d=0.3\), \(BF=.49\)
  • PNP vs. PD-: \(t(57)=-2.6\), \(p=.012\), \(d=0.7\), \(BF=4.1\)
  • PD+ vs. PD-: \(t(53)=-1.2\), \(p=.249\), \(d=0.3\), \(BF=.48\)
Figure 8: Prediction error of bonus round judgments. The red dotted line indicates a random baseline.

4.6.2 Prediction error and confidence

Code
# Across all judgments and participants, there was no systematic relation between confidence and prediction error:
# corTestPretty(dat_bonus$error, dat_bonus$howSecure, method = "kendall") 
# cor.test(dat_bonus$error, dat_bonus$howSecure, method = "kendall") 
# correlationBF(dat_bonus$error, dat_bonus$howSecure, method = "kendall") 

A Bayesian regression with prediction error as dependent variable, confidence, group, and their interaction as population-level (“fixed”) effects, and a random intercept for participants showed that for PNP patients confidence and prediction error were negatively correlated (i.e., lower confidence was associated with a higher error), whereas for the two Parkinson groups there was no relation.

Code
brm_bonus_confidence_error_by_group <- run_model(brm(error ~ howSecure * group + (1|id), 
                                                     data=dat_bonus, 
                                                     cores=1,  
                                                     control = list(adapt_delta = 0.99),
                                                     seed = 0815), 
                                                 modelName = 'brm_bonus_confidence_error_by_group')
#tab_model(brm_bonus_confidence_error_by_group, bpe = "mean", title = "Bayesian regression results: Prediction error and confidence") 
#bayes_R2(brm_bonus_confidence_error_by_group)

4.6.3 Analysis of selected tiles

To analyze selected and not-selected options, we first averaged the predicted reward and confidence of the not-chosen tiles within subjects, and then compared chosen and not chosen options. Selected tiles tended to have higher predicted rewards. Participants were not more confident in selected options, and selected tiles did not have a higher true reward than not selected tiles.
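
The two-step procedure (averaging within participants, then a paired comparison) can be sketched as follows. The toy data are invented, and the paired t-test is run directly with `t.test()` here, whereas the plots in this section obtain it via `stat_compare_means()`.

```r
# Toy bonus-round data: 3 hypothetical participants x 5 judged tiles each
toy <- data.frame(
  id         = rep(1:3, each = 5),
  chosen     = rep(c("chosen", rep("not chosen", 4)), times = 3),
  givenValue = c(40, 20, 25, 30, 25,
                 35, 30, 28, 32, 30,
                 30, 10, 20, 15, 15)
)

# Step 1: mean predicted reward per participant and choice status
# (rows = participants, columns = "chosen" / "not chosen")
m <- tapply(toy$givenValue, list(toy$id, toy$chosen), mean)
print(m)

# Step 2: paired comparison of the chosen tile vs. the mean of the not-chosen tiles
res <- t.test(m[, "chosen"], m[, "not chosen"], paired = TRUE)
print(unname(res$estimate))
```

Averaging the four not-chosen tiles first gives each participant exactly one value per condition, which is what the paired test requires.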

Code
# average not-chosen tiles within subjects first
df_chosen_overall <- dat_bonus %>% 
  group_by(id, chosen) %>% 
  summarise(m_givenValue = mean(givenValue),
            m_howSecure = mean(howSecure),
            m_true_z = mean(true_z))

# df_chosen_overall %>% 
#   group_by(chosen) %>% 
#   summarise(predicted_reward = mean(m_givenValue),
#             confidence = mean(m_howSecure),
#             true_reward = mean(m_true_z)) %>% 
#   kable(format = "html", escape = F, digits = 2) %>%
#   kable_styling("striped", full_width = F)

df_chosen_group <- dat_bonus %>% 
  group_by(id, group, chosen) %>% 
  summarise(m_givenValue = mean(givenValue),
            m_howSecure = mean(howSecure),
            m_true_z = mean(true_z))

# df_chosen_group %>% 
#   group_by(group,chosen) %>% 
#   summarise(predicted_reward = mean(m_givenValue),
#             confidence = mean(m_howSecure),
#             true_reward = mean(m_true_z)) %>% 
#   kable(format = "html", escape = F, digits = 2) %>%
#   kable_styling("striped", full_width = F)

# chosen vs not chosen: predicted reward

ggboxplot(df_chosen_group, x = "chosen", y = "m_givenValue",
          color = "group", palette =groupcolors, fill = "group", alpha = 0.2,
          add = "jitter", shape = "group", title = "Predicted reward of chosen vs. not chosen options",
          facet.by = "group") +
  ylab("Predicted reward") +
  xlab("") +
  stat_compare_means(comparisons = list( c("chosen", "not chosen") ), paired = T, method = "t.test", label = "p.format") +
  stat_summary(fun = mean, geom="point", shape = 23, fill = "white", size=3) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.title = element_blank()
  )

ggsave("plots/bonusround_chosen_not_chosen_options_predicted_reward.png", dpi=300, width=7, height = 5)
Figure 9: Predicted reward of chosen vs. not chosen options in bonus round.
Code
# chosen vs not chosen: confidence
ggboxplot(df_chosen_group, x = "chosen", y = "m_howSecure",
          color = "group", palette =groupcolors, fill = "group", alpha = 0.2,
          add = "jitter", shape = "group", title = "Confidence of chosen vs. not chosen options",
          facet.by = "group") +
  ylab("Confidence in reward prediction") +
  xlab("") +
  stat_compare_means(comparisons =  list( c("chosen", "not chosen") ), paired = T, method = "t.test", label = "p.format") +
  stat_summary(fun = mean, geom="point", shape = 23, fill = "white", size=3) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.title = element_blank()
  )

ggsave("plots/bonusround_chosen_not_chosen_options_confidence.png", dpi=300, width=7, height = 5)
Figure 10: Confidence in reward prediction of chosen vs. not chosen options in bonus round.

5 Appendix

5.1 Distribution of BDI, MMSE, and Hoehn-Yahr in each group

Code
# dotplot BDI
# p_dotplot_BDI <- 
ggplot(dat_sample, aes(x = BDI, fill = group)) +
  facet_wrap(~group) +
  geom_dotplot(binwidth = 1, dotsize = 1) +
  scale_fill_manual(values = groupcolors) +
  scale_x_continuous("BDI score") + 
  scale_y_continuous(NULL, breaks = NULL) + 
  coord_fixed(ratio = 15) +
  theme_classic() +
  theme(
    legend.title = element_blank(),
    legend.position = 'none',
    strip.text = element_text(size=14),
    legend.text =  element_text(colour="black"),
    text = element_text(colour = "black"),
    strip.background =element_blank(),
    axis.text.x = element_text(colour="black"),
    axis.text.y = element_text(colour="black"),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank())

Code
# p_dotplot_MMSE <- 
ggplot(dat_sample, aes(x = MMSE, fill = group)) +
  facet_wrap(~group) +
  geom_dotplot(binwidth = 1, dotsize = 1) +
  scale_fill_manual(values = groupcolors) +
  scale_x_continuous("MMSE score") + 
  scale_y_continuous(NULL, breaks = NULL) + 
  coord_fixed(ratio = 15) +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5, size = 10),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        text = element_text(colour = "black"),
        strip.background =element_blank(),
        axis.text.x = element_text(colour="black"),
        axis.text.y = element_text(colour="black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

Code
p_dotplot_HY <- ggplot(filter(dat_sample, group != "PNP"), aes(x = hoehn_yahr, fill = group)) +
  facet_wrap(~group) +
  geom_dotplot(binwidth = 1, dotsize = 1) +
  scale_fill_manual(values = groupcolors) +
  scale_x_continuous("Hoehn-Yahr score") + 
  scale_y_continuous(NULL, breaks = NULL) + 
  coord_fixed(ratio = 15) +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5, size = 10),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        text = element_text(colour = "black"),
        strip.background =element_blank(),
        axis.text.x = element_text(colour="black"),
        axis.text.y = element_text(colour="black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

5.2 Performance as function of BDI, MMSE, and Hoehn-Yahr

5.2.1 Performance as function of depression score (BDI-II)

The plots show performance as function of depression score (BDI-II), separately for each group. Opposing trends were found in the different groups: for patients with polyneuropathy (PNP), there was a negative relation such that patients with higher depression scores obtained lower rewards. For the two Parkinson groups, the relation was positive, such that patients reporting more severe symptoms obtained higher rewards.

Code
df_correlation_groups_BDI <- df_mean_reward_subject %>%
  group_by(group) %>%
  summarise(correlation = cor(BDI, mean_reward, use = "complete.obs"))


ggplot(df_mean_reward_subject, aes(x = BDI, y = mean_reward)) +
  facet_wrap(~group,  scales = "free_x") +
  #geom_hline(data=filter(df_random_performance, environment=="Rough"), linetype="dotted", aes(yintercept=z_learn_envs)) + 
  #geom_hline(data=filter(df_random_performance, environment=="Smooth"), linetype="dotted",aes(yintercept=z_learn_envs)) +
  geom_smooth(colour = "black", linetype = "dashed", linewidth = 0.5, method="lm", se=T, alpha = 0.2) +
  #geom_jitter(aes(fill = BDI), shape = 21, alpha = jit_alpha, colour = "black") +
  geom_point(aes(fill = BDI), shape = 21, alpha = jit_alpha, colour = "black") +
  scale_fill_gradient(low="blue",high="red") +
  scale_y_continuous("Mean reward") + 
  scale_x_continuous("BDI score", breaks = c(0,5,10,15)) +
  coord_cartesian(xlim = c(0,15), ylim = c(20,40)) +
  ggtitle("Performance as function of depression score") +
  geom_text(data = df_correlation_groups_BDI, aes(x = 2, y = 20, label = paste0("italic(r) == ", round(correlation, 2))), parse = TRUE, inherit.aes = FALSE, size = 4) +
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(size =16),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        strip.background = element_blank(),
        strip.text = element_text(size=14),
        axis.text = element_text(colour = "black"),
        axis.text.x = element_text(size=12),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

ggsave("plots/performance_groups_BDI.png", dpi=300, height = 3, width = 7)
Figure 11: Performance as function of depression score (BDI-II).

5.2.2 Performance as function of Mini-Mental State Examination (MMSE)

Higher values indicate better cognitive functioning.

Code
df_correlation_groups_MMSE <- df_mean_reward_subject %>%
  group_by(group) %>%
  summarise(correlation = cor(MMSE, mean_reward, use = "complete.obs"))

ggplot(df_mean_reward_subject, aes(x = MMSE, y = mean_reward)) +
  facet_wrap(~group,  scales = "free_x") +
  geom_smooth(colour = "black", linetype = "dashed", linewidth = 0.5, method="lm", se=T, alpha = 0.2) +
  geom_point(aes(fill = MMSE), shape = 21, alpha = jit_alpha, colour = "black") +
  scale_fill_gradient(low="blue",high="red") +
  scale_y_continuous("Mean reward") + 
  scale_x_continuous("MMSE score", breaks = c(27,28,29,30)) +
  coord_cartesian(xlim = c(27,30), ylim = c(23,40)) +
  ggtitle("Performance as function of mental status (MMSE)") +
  geom_text(data = df_correlation_groups_MMSE, aes(x = 27.5, y = 23, label = paste0("italic(r) == ", round(correlation, 2))), parse = TRUE, inherit.aes = FALSE, size = 3) +
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(size=16),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        strip.background = element_blank(),
        strip.text = element_text(size=14),
        axis.text = element_text(colour = "black"),
        axis.text.x = element_text(size=12),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

ggsave("plots/performance_groups_MMSE.png", dpi=300, height = 3, width = 7)
Figure 12: Performance as a function of Mini-Mental State Examination (MMSE).
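
The r values annotated in the figure are descriptive Pearson correlations computed per group. As a minimal sketch of how a confidence interval could be attached to each of them, the following uses base R's `cor.test()` on simulated data. The data frame `df_sim`, its group labels, and all values are hypothetical stand-ins for illustration only, not the study data, and this is not necessarily the analysis used by the authors.

```r
library(dplyr)

set.seed(1)

# Simulated stand-in for df_mean_reward_subject (hypothetical values, not study data)
df_sim <- data.frame(
  group       = rep(c("G1", "G2"), each = 30),
  MMSE        = sample(27:30, 60, replace = TRUE),
  mean_reward = rnorm(60, mean = 30, sd = 3)
)

# Pearson correlation with 95% confidence interval, separately for each group.
# cor.test() drops incomplete pairs itself, mirroring use = "complete.obs".
df_cor_ci <- df_sim %>%
  group_by(group) %>%
  summarise(
    r       = unname(cor.test(MMSE, mean_reward)$estimate),
    ci_low  = cor.test(MMSE, mean_reward)$conf.int[1],
    ci_high = cor.test(MMSE, mean_reward)$conf.int[2],
    n       = sum(complete.cases(MMSE, mean_reward))
  )

print(df_cor_ci)
```

With the narrow range of MMSE scores shown on the x-axis (27 to 30), intervals of this kind tend to be wide, which is worth keeping in mind when reading the point estimates.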

5.2.3 Performance as a function of Hoehn-Yahr (Parkinson patients only)

The Hoehn-Yahr scale provides basic information about the severity of motor impairments in Parkinson’s disease, with higher scores indicating greater severity.

Code
# separately for each group
df_correlation_groups_hoehn_yahr <- df_mean_reward_subject %>%  
  filter(group != "PNP") %>% 
  group_by(group) %>%
  summarise(correlation = cor(hoehn_yahr, mean_reward, use = "complete.obs"))

ggplot(filter(df_mean_reward_subject, group != "PNP"), aes(x = hoehn_yahr, y = mean_reward)) +
  facet_wrap(~group,  scales = "free_x") +
  #geom_hline(data=filter(df_random_performance, environment=="Rough"), linetype="dotted", aes(yintercept=z_learn_envs)) + 
  #geom_hline(data=filter(df_random_performance, environment=="Smooth"), linetype="dotted",aes(yintercept=z_learn_envs)) +
  geom_smooth(colour = "black", linetype = "dashed", linewidth = 0.5, method="lm", se=T, alpha = 0.2) +
  # geom_jitter() takes `width`, not `jit_width`; the unknown argument was silently ignored
  geom_jitter(aes(fill = MMSE), shape = 21, alpha = jit_alpha, width = 0.0001, colour = "black") +
  #geom_point(aes(fill = hoehn_yahr), shape = 21, alpha = jit_alpha, colour = "black") +
  scale_fill_gradient(low="blue",high="red") +
  scale_y_continuous("Mean reward") + 
  scale_x_continuous("Hoehn-Yahr score", breaks = c(1,2,3)) +
  coord_cartesian(xlim = c(1,3), ylim = c(23,40)) +
  ggtitle("Performance as function of Hoehn-Yahr score") +
  geom_text(data = df_correlation_groups_hoehn_yahr, aes(x = 1.2, y = 23, label = paste0("italic(r) == ", round(correlation, 2))), parse = TRUE, inherit.aes = FALSE) +
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(size=16),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        strip.background = element_blank(),
        strip.text = element_text(size=14),
        axis.text = element_text(colour = "black"),
        axis.text.x = element_text(size=12),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

Performance as a function of the Hoehn-Yahr scale (Parkinson patients only).

Code
ggsave("plots/performance_groups_HY.png", dpi=300, height = 3, width = 5)
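
Hoehn-Yahr stages are ordinal, so a rank-based correlation may be a useful complement to the Pearson r shown in the plot. The sketch below computes both per group with `cor()` on simulated data. The data frame `df_sim_hy`, the group labels, and all values are hypothetical and not taken from the study.

```r
library(dplyr)

set.seed(2)

# Simulated stand-in for the patient subset (hypothetical values, not study data)
df_sim_hy <- data.frame(
  group       = rep(c("G1", "G2"), each = 20),
  hoehn_yahr  = sample(1:3, 40, replace = TRUE),
  mean_reward = rnorm(40, mean = 30, sd = 3)
)

# Pearson and Spearman correlations side by side, separately for each group
df_cor_hy <- df_sim_hy %>%
  group_by(group) %>%
  summarise(
    pearson  = cor(hoehn_yahr, mean_reward, use = "complete.obs"),
    spearman = cor(hoehn_yahr, mean_reward, use = "complete.obs", method = "spearman")
  )

print(df_cor_hy)
```

If the two coefficients diverge noticeably in real data, that would suggest the linear fit is sensitive to the coarse, tied stage values.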

References

Abbott, A. (2010). Levodopa: The story so far. Nature, 466(7310), S6–S7.
Beck, A. T., Steer, R. A., Brown, G. K., et al. (1996). Beck Depression Inventory.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). “Mini-mental state”: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12(3), 189–198.
Giron, A. P., Ciranka, S., Schulz, E., Bos, W. van den, Ruggeri, A., Meder, B., & Wu, C. M. (2023). Developmental changes in exploration resemble stochastic optimization. Nature Human Behaviour, 7(11), 1955–1967. https://doi.org/10.1038/s41562-023-01662-1
Goetz, C. G., Poewe, W., Rascol, O., Sampaio, C., Stebbins, G. T., Counsell, C., Giladi, N., Holloway, R. G., Moore, C. G., Wenning, G. K., et al. (2004). Movement Disorder Society Task Force report on the Hoehn and Yahr staging scale: Status and recommendations. The Movement Disorder Society Task Force on Rating Scales for Parkinson’s Disease. Movement Disorders, 19(9), 1020–1028.
Hautzinger, M., Keller, F., & Kühner, C. (2006). Beck Depressions-Inventar (BDI-II). Harcourt Test Services.
Hoehn, M. M., & Yahr, M. D. (1967). Parkinsonism: Onset, progression, and mortality. Neurology, 17(5), 427–442.
Meder, B., Wu, C. M., Schulz, E., & Ruggeri, A. (2021). Development of directed and random exploration in children. Developmental Science, 24(4), e13095. https://doi.org/10.1111/desc.13095
Sadeghiyeh, H., Wang, S., Alberhasky, M. R., Kyllo, H. M., Shenhav, A., & Wilson, R. C. (2020). Temporal discounting correlates with directed exploration but not with random exploration. Scientific Reports, 10(1), 4020.
Schulz, E., Wu, C. M., Ruggeri, A., & Meder, B. (2019). Searching for rewards like a child means less generalization and more directed exploration. Psychological Science, 30(11), 1561–1572. https://doi.org/10.1177/0956797619863663
Tambasco, N., Romoli, M., & Calabresi, P. (2018). Levodopa in Parkinson’s disease: Current status and future developments. Current Neuropharmacology, 16(8), 1239–1252.
Wu, C. M., Meder, B., & Schulz, E. (2025). Unifying principles of generalization: Past, present, and future. Annual Review of Psychology, 76, 275–302. https://doi.org/10.1146/annurev-psych-021524-110810
Wu, C. M., Schulz, E., Speekenbrink, M., Nelson, J. D., & Meder, B. (2018). Generalization guides human exploration in vast decision spaces. Nature Human Behaviour, 2, 915–924. https://doi.org/10.1038/s41562-018-0467-4